13 research outputs found

    STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset

    In recent years, the automatic generation of image descriptions (captions), known as image captioning, has attracted a great deal of attention. In this paper, we focus on generating Japanese captions for images. Because most available caption datasets have been constructed for English, few datasets exist for Japanese. To address this problem, we construct a large-scale Japanese image caption dataset, called STAIR Captions, based on images from MS-COCO. STAIR Captions consists of 820,310 Japanese captions for 164,062 images. In our experiments, we show that a neural network trained on STAIR Captions generates more natural and better Japanese captions than those obtained by first generating English captions and then applying English-Japanese machine translation.
    Comment: Accepted as ACL2017 short paper. 5 pages

    Ridge Regression, Hubness, and Zero-Shot Learning

    This paper discusses the effect of hubness in zero-shot learning when ridge regression is used to find a mapping between the example space and the label space. Contrary to the existing approach, which attempts to find a mapping from the example space to the label space, we show that mapping labels into the example space is desirable to suppress the emergence of hubs in the subsequent nearest neighbor search step. Assuming a simple data model, we prove that the proposed approach indeed reduces hubness. This was verified empirically on the tasks of bilingual lexicon extraction and image labeling: hubness was reduced in both tasks, and accuracy improved accordingly.
    Comment: To be presented at ECML/PKDD 201
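The reverse mapping described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ridge_fit` and `predict_labels`, the synthetic data, the hidden linear map `T`, and the regularization strength are all assumptions made for illustration. Ridge regression is fitted from label vectors to examples, every candidate label is mapped into the example space, and each test example is assigned to its nearest mapped label.

```python
import numpy as np

def ridge_fit(A, B, lam=1.0):
    """Closed-form ridge regression: argmin_M ||A M - B||^2 + lam * ||M||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ B)

def predict_labels(X_test, label_vecs, M):
    """Map every label vector into the example space, then do nearest neighbor search."""
    mapped = label_vecs @ M  # shape: (num_labels, example_dim)
    sq_dist = ((X_test[:, None, :] - mapped[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)

# Synthetic zero-shot setup (all values below are hypothetical).
rng = np.random.default_rng(0)
label_vecs = rng.normal(size=(8, 4))   # 8 classes with 4-dim label vectors
T = rng.normal(size=(4, 6))            # hidden linear map into a 6-dim example space

y_train = np.repeat(np.arange(6), 20)  # classes 0..5 are seen during training
X_train = label_vecs[y_train] @ T + 0.01 * rng.normal(size=(120, 6))

# Fit the mapping in the label -> example direction.
M = ridge_fit(label_vecs[y_train], X_train, lam=1e-3)

y_test = np.repeat(np.array([6, 7]), 10)  # classes 6 and 7 are unseen
X_test = label_vecs[y_test] @ T + 0.01 * rng.normal(size=(20, 6))
pred = predict_labels(X_test, label_vecs, M)
```

The conventional direction would instead fit `ridge_fit(X_train, label_vecs[y_train])` and search for neighbors in the label space. The sketch only shows the mechanics of the reversed direction and does not measure hubness.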

    Learning Decorrelated Representations Efficiently Using Fast Fourier Transform

    Barlow Twins and VICReg are self-supervised representation learning models that use regularizers to decorrelate features. Although these models are as effective as conventional representation learning models, their training can be computationally demanding if the dimension d of the projected embeddings is high. As the regularizers are defined in terms of individual elements of a cross-correlation or covariance matrix, computing the loss for n samples takes O(n d^2) time. In this paper, we propose a relaxed decorrelating regularizer that can be computed in O(n d log d) time using the Fast Fourier Transform. We also propose an inexpensive technique to mitigate undesirable local minima that develop with the relaxation. The proposed regularizer achieves accuracy comparable to that of existing regularizers in downstream tasks, while training with it requires less memory and is faster for large d. The source code is available.
    Comment: Accepted for CVPR 202
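The complexity gap mentioned in this abstract can be made concrete with a small sketch. The abstract does not specify the relaxed regularizer, so the quantity computed here is an assumption chosen for illustration: the sums of the cross-correlation matrix along its wrapped diagonals. These sums equal a circular cross-correlation averaged over samples, which the FFT yields without forming the d x d matrix. The function names are hypothetical, and feature standardization is omitted.

```python
import numpy as np

def diagonal_sums_naive(A, B):
    """Sum C = A^T B / n along each wrapped diagonal; forms the full d x d matrix."""
    n, d = A.shape
    C = A.T @ B / n  # O(n d^2) time, O(d^2) memory
    idx = np.arange(d)
    return np.array([C[idx, (idx + k) % d].sum() for k in range(d)])

def diagonal_sums_fft(A, B):
    """Same sums via circular cross-correlation in the Fourier domain, O(n d log d)."""
    n, d = A.shape
    FA = np.fft.rfft(A, axis=1)
    FB = np.fft.rfft(B, axis=1)
    # For real inputs, correlation corresponds to conj(FA) * FB; average over samples.
    return np.fft.irfft((np.conj(FA) * FB).sum(axis=0), n=d) / n
```

Both functions return a length-d vector whose entry k is the sum over i of C[i, (i + k) mod d]. A loss built only on such sums never touches individual matrix elements, which is what makes the FFT route possible under this assumption.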

    Action Class Relation Detection and Classification Across Multiple Video Datasets

    The Meta Video Dataset (MetaVD) provides annotated relations between action classes in major datasets for human action recognition in videos. Although these annotated relations enable dataset augmentation, the augmentation is applicable only to the datasets covered by MetaVD. For an external dataset to enjoy the same benefit, the relations between its action classes and those in MetaVD need to be determined. To address this issue, we consider two new machine learning tasks: action class relation detection and classification. We propose a unified model to predict relations between action classes, using language and visual information associated with the classes. Experimental results show that (i) recent pre-trained neural network models for texts and videos contribute to high predictive performance, (ii) relation prediction based on action label texts is more accurate than prediction based on videos, and (iii) a blending approach that combines the predictions of both modalities can further improve predictive performance in some cases.
    Comment: Accepted to Pattern Recognition Letters. 12 pages, 4 figures

    Improving the Accuracy of Nearest Neighbor Methods by Reducing Hubs (ハブ サクゲン ニ ヨル キンボウホウ ノ セイド カイゼン)

    No full text
    Doctoral degree No. 1399 (Kō No. 1399), Doctor of Engineering, Nara Institute of Science and Technology

    Improving Nearest Neighbor Methods from the Perspective of Hubness Phenomenon

    No full text

    Sterile protection and transmission blockade by a multistage anti-malarial vaccine in the pre-clinical study.

    No full text
    Peer reviewed: True
    The Malaria Vaccine Technology Roadmap 2013 (World Health Organization) aims to develop safe and effective vaccines by 2030 that will offer at least 75% protective efficacy against clinical malaria and reduce parasite transmission. Here, we demonstrate a highly effective multistage vaccine against both the pre-erythrocytic and sexual stages of Plasmodium falciparum that protects and reduces transmission in a murine model. The vaccine is based on a viral-vectored vaccine platform comprising a highly attenuated vaccinia virus strain, LC16m8Δ (m8Δ), a genetically stable variant of the licensed and highly effective Japanese smallpox vaccine LC16m8, and an adeno-associated virus (AAV), a viral vector for human gene therapy. The genes encoding P. falciparum circumsporozoite protein (PfCSP) and the ookinete protein P25 (Pfs25) are expressed as a Pfs25-PfCSP fusion protein. The heterologous m8Δ-prime/AAV-boost immunization regimen in mice provided both 100% protection against PfCSP-transgenic P. berghei sporozoites and up to 100% transmission-blocking efficacy, as determined by a direct membrane feeding assay using parasites from P. falciparum-positive, naturally infected donors from endemic settings. Remarkably, vaccine-induced immune responses persisted for over 7 months and additionally provided complete protection against repeated parasite challenge in a murine model. We propose that the m8Δ/AAV malaria multistage vaccine platform has the potential to contribute to the landmark goals of the malaria vaccine technology roadmap: life-long sterile protection and high-level transmission-blocking efficacy.